A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity
Authors
Abstract
We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as the stochastic or PAC-Bayesian KL(posterior ∥ prior) complexity). For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity via Rademacher complexity to L2(P) entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi, who treated the log-loss case with L∞ entropy. Together, these results recover optimal bounds for VC and large (polynomial-entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: ‘easiness’ (Bernstein) conditions and model complexity.
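For readers less familiar with these notions, the following LaTeX sketch recalls the standard textbook forms of the three complexities the abstract names: empirical Rademacher complexity, PAC-Bayesian (information) complexity, and Shtarkov/NML complexity. The notation is generic and is not necessarily the paper's own definition of its unified complexity.

% Empirical Rademacher complexity of a class F on a sample z_1,...,z_n,
% with independent Rademacher signs sigma_i uniform on {-1,+1}:
\[
  \widehat{\mathfrak{R}}_n(\mathcal{F})
    = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}}
        \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(z_i) \right].
\]
% PAC-Bayesian / information complexity of a posterior \widehat{\Pi}
% relative to a prior \Pi:
\[
  \mathrm{COMP}_{\mathrm{PAC\text{-}Bayes}}
    = \frac{\mathrm{KL}\!\left( \widehat{\Pi} \,\Vert\, \Pi \right)}{n}.
\]
% Shtarkov / normalized maximum likelihood (NML) complexity of a model
% class {p_theta}: the minimax individual-sequence log-loss regret
% (the sum becomes an integral for continuous data):
\[
  \mathrm{COMP}_{\mathrm{NML}}
    = \log \sum_{z^n} \sup_{\theta} p_{\theta}(z^n).
\]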
Similar Resources
Théorie Statistique de l’Apprentissage: une approche PAC-Bayésienne (PAC-Bayesian Statistical Learning Theory)
This PhD thesis is a mathematical study of the learning task – specifically classification and least squares regression – in order to better understand why an algorithm works and to propose more efficient procedures. The thesis consists of four papers. The first one provides a PAC bound for the L generalization error of methods based on combining regression procedures. This bound is tight to the...
Simplified PAC-Bayesian Margin Bounds
The theoretical understanding of support vector machines is largely based on margin bounds for linear classifiers with unit-norm weight vectors and unit-norm feature vectors. Unit-norm margin bounds have been proved previously using fat-shattering arguments and Rademacher complexity. Recently Langford and Shawe-Taylor proved a dimension-independent unit-norm margin bound using a relatively simpl...
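For background, a LaTeX sketch of the quantities this entry refers to: the empirical margin error of a unit-norm linear classifier, and the typical shape of a dimension-independent unit-norm margin bound. Constants and logarithmic factors are suppressed, and this is not the exact statement proved in the paper.

% Empirical gamma-margin error of a unit-norm linear classifier w on
% unit-norm examples (x_i, y_i), with labels y_i in {-1,+1}:
\[
  \widehat{\mathrm{err}}_{\gamma}(w)
    = \frac{1}{n} \sum_{i=1}^{n}
        \mathbf{1}\!\left[ y_i \langle w, x_i \rangle \le \gamma \right],
  \qquad \|w\| = \|x_i\| = 1.
\]
% Typical shape of a dimension-independent unit-norm margin bound
% (schematic only; constants and log factors omitted):
\[
  \Pr\!\left( y \langle w, x \rangle \le 0 \right)
    \;\lesssim\;
  \widehat{\mathrm{err}}_{\gamma}(w) + \sqrt{\frac{1}{\gamma^{2} n}}.
\]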
متن کاملThe Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning
We derive an upper bound on the local Rademacher complexity of Lp-norm multiple kernel learning, which yields a tighter excess risk bound than global approaches. Previous local approaches analyzed the case p = 1 only, while our analysis covers all cases 1 ≤ p ≤ ∞, assuming the different feature mappings corresponding to the different kernels to be uncorrelated. We also show a lower bound that sh...
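For context, a LaTeX sketch of the standard definition of local Rademacher complexity that local analyses of this kind build on; the notation is generic, and the kernel-specific bound of the paper is not reproduced here.

% Local Rademacher complexity: the Rademacher average restricted to
% functions of small second moment P f^2 <= r; localization is what
% yields excess risk rates faster than the global 1/sqrt(n) rate.
\[
  \mathfrak{R}_{n}(\mathcal{F}; r)
    = \mathbb{E}\!\left[ \sup_{\substack{f \in \mathcal{F} \\ P f^{2} \le r}}
        \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \right].
\]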
Journal: CoRR
Volume: abs/1710.07732
Pages: –
Published: 2017